A Hierarchy-Aware Pose Representation for Deep Character Animation

Data-driven character animation techniques rely on a well-established model of motion that can describe its rich context. However, commonly used motion representations often fail to accurately encode the full articulation of motion, or they introduce artifacts. In this work, we address the fundamental problem of finding a robust pose representation for motion modeling, suitable for deep character animation, one that better constrains poses and faithfully captures nuances correlated with skeletal characteristics. Our representation is based on dual quaternions, mathematical abstractions with well-defined operations that simultaneously encode rotational and positional information, enabling a hierarchy-aware encoding centered around the root. We demonstrate that our representation overcomes common motion artifacts, and we assess its performance against other popular representations. We conduct an ablation study to evaluate the impact of various losses that can be incorporated during learning. Because our representation implicitly encodes skeletal motion attributes, we train a network on a dataset comprising skeletons with different proportions, without first retargeting them to a universal skeleton, a step that causes subtle motion elements to be lost. We show that smooth and natural poses can be achieved, paving the way for fascinating applications.

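Data-driven character animation techniques depend on a properly established motion model that can describe rich context. However, commonly used motion representations often cannot accurately encode the full articulation of motion, or they exhibit artifacts. In this work, we therefore address the fundamental problem of finding a robust pose representation for motion modeling, suitable for deep character animation, that constrains poses more appropriately and faithfully captures nuances related to skeletal characteristics. The representation is based on dual quaternions, mathematical abstractions with well-defined operations that encode rotation and position simultaneously and enable a hierarchy-aware encoding centered on the root. We demonstrate that our representation overcomes common motion artifacts and evaluate its performance against other popular representations. We also conduct an ablation study to evaluate the effect of various losses that can be incorporated during learning. Exploiting the fact that our representation implicitly encodes skeletal motion attributes, we train a network on a dataset of skeletons with different proportions, without first retargeting them to a universal skeleton, which would cause subtle motion elements to be lost. As a result, we show that smooth and natural poses can be achieved, opening the way to compelling applications.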
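
To make the idea of a dual-quaternion pose concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common convention in which a unit dual quaternion is stored as a pair (real, dual) with dual = 0.5 * t * r, where r is the rotation quaternion in (w, x, y, z) order and t is the translation. All function names are hypothetical and chosen for illustration. Multiplying a parent's root-relative dual quaternion by a child's local one gives the child's root-relative rotation and position in a single operation, which is one way a hierarchy-aware encoding centered on the root could be computed.

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def q_conj(q):
    """Conjugate of a quaternion (the inverse, for unit quaternions)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def dq_from_rt(rot, trans):
    """Build a dual quaternion (real, dual) from a unit rotation quaternion
    and a 3D translation, using the assumed convention dual = 0.5 * t * r."""
    t = (0.0, trans[0], trans[1], trans[2])
    dual = tuple(0.5 * c for c in q_mul(t, rot))
    return (rot, dual)


def dq_mul(a, b):
    """Compose two dual quaternions: apply b in the frame defined by a."""
    ar, ad = a
    br, bd = b
    real = q_mul(ar, br)
    dual = tuple(x + y for x, y in zip(q_mul(ar, bd), q_mul(ad, br)))
    return (real, dual)


def dq_translation(dq):
    """Recover the translation t = 2 * dual * conj(real)."""
    real, dual = dq
    t = q_mul(tuple(2.0 * c for c in dual), q_conj(real))
    return t[1:]


# Hypothetical two-joint chain: the root sits at (0, 1, 0) and is rotated
# 90 degrees about the z axis; the child is offset by (1, 0, 0) in the
# root's local frame and has no local rotation.
half = math.radians(90.0) / 2.0
root = dq_from_rt((math.cos(half), 0.0, 0.0, math.sin(half)), (0.0, 1.0, 0.0))
child_local = dq_from_rt((1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0))

# Root-relative pose of the child: rotation and position in one object.
child_root_relative = dq_mul(root, child_local)
child_position = dq_translation(child_root_relative)
```

In this example the child's root-relative position is the root position plus the child's offset rotated by the root's rotation, so the bone length and the joint rotation are carried together in one quantity rather than as separate rotation and position channels.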